feature channel
Learning Semantic-aware Normalization for Generative Adversarial Networks
Recent advances in image generation have been achieved by style-based image generators. Such approaches learn to disentangle latent factors at different image scales and encode those factors as "style" to control image synthesis. However, existing approaches cannot further disentangle fine-grained semantics from each other, which are often conveyed by feature channels. In this paper, we propose a novel image synthesis approach that learns Semantic-aware relative importance for feature channels in Generative Adversarial Networks (SariGAN).
Adaptive Diffusion in Graph Neural Networks
The success of graph neural networks (GNNs) largely relies on the process of aggregating information from neighbors defined by the input graph structures. Notably, message-passing-based GNNs, e.g., graph convolutional networks, leverage the immediate neighbors of each node during the aggregation process, and recently, graph diffusion convolution (GDC) was proposed to expand the propagation neighborhood by leveraging generalized graph diffusion. However, the neighborhood size in GDC is manually tuned for each graph by conducting a grid search over the validation set, which limits its generalization in practice. To address this issue, we propose the adaptive diffusion convolution (ADC) strategy to automatically learn the optimal neighborhood size from the data. Furthermore, we break the conventional assumption that all GNN layers and feature channels (dimensions) should use the same neighborhood for propagation. We design strategies to enable ADC to learn a dedicated propagation neighborhood for each GNN layer and each feature channel, making the GNN architecture fully coupled with graph structures, the unique property that distinguishes GNNs from traditional neural networks. By directly plugging ADC into existing GNNs, we observe consistent and significant improvements over both GDC and the vanilla versions of those GNNs across various datasets, demonstrating the improved model capacity brought by automatically learning a unique neighborhood size per layer and per channel in GNNs.
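To make the idea of a per-channel propagation neighborhood concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the heat-kernel form of generalized graph diffusion, with coefficients e^{-t} t^k / k! truncated after a fixed number of terms, and a row-normalized transition matrix. The function names (`row_normalize`, `heat_diffuse_channel`, `diffuse_features`) and the fixed diffusion times are hypothetical; in ADC the diffusion time would be learned from data instead of being supplied by hand.

```python
import math


def row_normalize(adj):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        deg = sum(row)
        out.append([v / deg if deg else 0.0 for v in row])
    return out


def matvec(mat, vec):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def heat_diffuse_channel(trans, signal, t, num_terms=20):
    """Approximate sum_k e^{-t} t^k / k! * T^k x with a truncated series."""
    result = [0.0] * len(signal)
    power = list(signal)  # holds T^k x, starting at k = 0
    for k in range(num_terms):
        coeff = math.exp(-t) * t ** k / math.factorial(k)
        result = [r + coeff * p for r, p in zip(result, power)]
        power = matvec(trans, power)
    return result


def diffuse_features(adj, features, times):
    """Diffuse each feature channel with its own diffusion time.

    `times[c]` is the (assumed, hand-picked) diffusion time of channel c;
    a larger value spreads that channel over a wider neighborhood.
    """
    trans = row_normalize(adj)
    columns = []
    for c in range(len(features[0])):
        signal = [row[c] for row in features]
        columns.append(heat_diffuse_channel(trans, signal, times[c]))
    return [list(vals) for vals in zip(*columns)]


# Path graph 0 - 1 - 2 with two feature channels per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
diffused = diffuse_features(adj, features, times=[0.0, 1.0])
```

With a diffusion time of 0 a channel is left unchanged, while a positive time mixes in values from progressively more distant nodes, which is the sense in which each channel can have its own neighborhood size.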
Generating Separated Singing Vocals Using a Diffusion Model Conditioned on Music Mixtures
Plaja-Roglans, Genís, Hung, Yun-Ning, Serra, Xavier, Pereira, Igor
Separating the individual elements in a musical mixture is an essential process for music analysis and practice. While this is generally addressed using neural networks optimized to mask or transform the time-frequency representation of a mixture to extract the target sources, the flexibility and generalization capabilities of generative diffusion models are giving rise to a novel class of solutions for this complicated task. In this work, we explore singing voice separation from real music recordings using a diffusion model which is trained to generate the solo vocals conditioned on the corresponding mixture. Our approach improves upon prior generative systems and achieves competitive objective scores against non-generative baselines when trained with supplementary data. The iterative nature of diffusion sampling enables the user to control the quality-efficiency trade-off, and also refine the output when needed. We present an ablation study of the sampling algorithm, highlighting the effects of the user-configurable parameters.
Visual Anomaly Detection for Reliable Robotic Implantation of Flexible Microelectrode Array
Chen, Yitong, Xu, Xinyao, Zhu, Ping, Han, Xinyong, Qin, Fangbo, Yu, Shan
Flexible microelectrode (FME) implantation into the brain cortex is challenging due to the deformable, fiber-like structure of the FME probe and its interaction with critical bio-tissue. To ensure reliability and safety, the implantation process should be monitored carefully. This paper develops an image-based anomaly detection framework based on the microscopic cameras of the robotic FME implantation system. The unified framework is used at four checkpoints to check the micro-needle, FME probe, hooking result, and implantation point, respectively. Exploiting the existing object localization results, the aligned regions of interest (ROIs) are extracted from the raw image and input to a pretrained vision transformer (ViT). Considering the task specifications, we propose a progressive granularity patch feature sampling method to address the sensitivity-tolerance trade-off at different locations. Moreover, we select a subset of feature channels with higher signal-to-noise ratios from the raw general ViT features to provide better descriptors for each specific scene. The effectiveness of the proposed methods is validated with the image datasets collected from our implantation system.
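The channel selection step can be illustrated with a small sketch. The abstract does not say how the signal-to-noise ratio is computed, so the definition used here (absolute mean divided by population standard deviation over a set of reference feature vectors) is purely an assumption, and the names `channel_snr` and `select_channels` are hypothetical.

```python
import statistics


def channel_snr(samples, channel):
    """Assumed SNR: |mean| / std of one channel over reference samples."""
    values = [sample[channel] for sample in samples]
    spread = statistics.pstdev(values)
    if spread == 0:
        return float("inf")  # a perfectly stable channel ranks first
    return abs(statistics.mean(values)) / spread


def select_channels(samples, keep):
    """Return the indices of the `keep` channels with the highest SNR."""
    num_channels = len(samples[0])
    ranked = sorted(
        range(num_channels),
        key=lambda c: channel_snr(samples, c),
        reverse=True,
    )
    return sorted(ranked[:keep])


# Three reference feature vectors with three channels each.
samples = [
    [1.0, 0.1, 5.0],
    [1.1, -0.2, 5.0],
    [0.9, 0.3, 5.0],
]
kept = select_channels(samples, keep=2)
```

In this toy input the second channel fluctuates around zero and is dropped, while the two stable channels are kept as descriptors.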
FlexiQ: Adaptive Mixed-Precision Quantization for Latency/Accuracy Trade-Offs in Deep Neural Networks
Kim, Jaemin, Um, Hongjun, Kim, Sungkyun, Park, Yongjun, Seo, Jiwon
Neural networks commonly execute on hardware accelerators such as NPUs and GPUs because of their size and computational overhead. These accelerators are costly, and it is hard to scale their resources to handle real-time workload fluctuations. We present FlexiQ, an adaptive mixed-precision quantization scheme for computer vision models. FlexiQ selectively applies low-bitwidth computation to feature channels with small value ranges and employs an efficient bit-lowering method to minimize quantization errors while maintaining inference accuracy. Furthermore, FlexiQ adjusts its low-bitwidth channel ratio in real time, enabling quantized models to effectively manage fluctuating inference workloads. We implemented a FlexiQ prototype, including the mixed-precision inference runtime on our custom NPU and GPUs. Evaluated on eleven convolution- and transformer-based vision models, FlexiQ achieves on average 6.6% higher accuracy for 4-bit models with finetuning and outperforms four state-of-the-art quantization techniques. Moreover, our mixed-precision models achieved an efficient accuracy-latency trade-off, with the 50% 4-bit model incurring only 0.6% accuracy loss while achieving 40% of the speedup of the 100% 4-bit model over the 8-bit model. Latency evaluations on our NPU and GPUs confirmed that FlexiQ introduces minimal runtime overhead, demonstrating its hardware efficiency and overall performance benefits.
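As a rough illustration of assigning low bitwidth to channels with small value ranges, the sketch below ranks channels by their peak absolute value and quantizes a configurable fraction of them to 4 bits, leaving the rest at 8 bits. It uses plain symmetric uniform quantization as a stand-in and does not reproduce FlexiQ's bit-lowering method or runtime; the function names and the example data are hypothetical.

```python
def pick_low_bit_channels(channels, ratio):
    """Pick the fraction `ratio` of channels with the smallest value range."""
    peaks = [max(abs(v) for v in ch) for ch in channels]
    order = sorted(range(len(channels)), key=lambda i: peaks[i])
    count = int(len(channels) * ratio)
    return set(order[:count])


def fake_quantize(values, bits):
    """Symmetric uniform quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    scale = peak / qmax  # width of one quantization step
    return [round(v / scale) * scale for v in values]


def mixed_precision(channels, ratio):
    """Quantize small-range channels to 4 bits and the others to 8 bits."""
    low = pick_low_bit_channels(channels, ratio)
    return [
        fake_quantize(ch, 4 if i in low else 8)
        for i, ch in enumerate(channels)
    ]


# Four channels; with ratio=0.5 the two narrowest ones go to 4 bits.
channels = [
    [0.1, -0.2, 0.05],
    [5.0, -3.0, 1.0],
    [0.5, 0.4, -0.3],
    [10.0, 2.0, -7.0],
]
low_bit = pick_low_bit_channels(channels, ratio=0.5)
quantized = mixed_precision(channels, ratio=0.5)
```

Because the step size scales with a channel's peak value, a narrow channel loses less absolute precision at 4 bits than a wide one would. Changing `ratio` at run time shifts the balance between 4-bit and 8-bit channels, which is the kind of knob the abstract describes for trading accuracy against latency.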